Freaks Macintosh Archive

Text File | 1994-08-24 | 37KB | 740 lines

Assembly Tutorial 1

Welcome back to DNA magazine! We've been out of publication for quite a few months now, and I've had plenty of time to write *really* long articles on my favorite subjects. I've decided to write this article in hopes that some of our readers will try to learn more about a very powerful and rewarding language. I am, of course, speaking of assembly language. Assembly is a very difficult language to grasp, but one which is very rewarding to the programmer who understands it. Knowledge of assembly language will give you a lot more insight into how higher level languages handle different situations. This knowledge will help you write more efficient and bug-free code in everything you do, in and out of assembly.

I'm not writing this as a tutorial on how to write a virus or how to crack software. My opinion on those is that you have to learn how to write your own programs before you can reprogram someone else's. Both viruses and cracking are very difficult subjects, and I won't get into them at all in this document. There is more than enough to talk about without even touching on those. You've been warned: this text contains constructive information only!

If you've ever looked at assembly language source code, you'll have noticed that it reads like one long list of commands that all look exactly the same. This tremendously long string of similar instructions appears to be in a random sequence. Yet there are those who claim to understand this cryptic language, and who create very fast, and sometimes very impressive, applications with it.
What I plan to teach you are some of the more basic, but very useful, applications of assembly language.

LANGUAGE OVERVIEW

For those of you not familiar with programming languages, I will go over the basics of what led up to modern day programming. Really though, if you don't know any programming languages, assembly is NOT the first one you should attempt to master. In that light, this is presented as an overview only.

When the first computers were created, there were necessarily no programming languages. This presented quite a challenge for those who wanted to do something with their new expensive computer. The computer doesn't speak English; quite to the contrary, it speaks a binary language that humans are not familiar with and that is difficult to translate. In order to program an early CPU, one had to write everything in binary and feed it to the CPU one instruction at a time. This first level of language is called machine language or machine code. It's absolute garbage to look at. None of it makes any sense. If you've ever looked at the hex dump of an executable program, you know what I'm talking about.

The next step up was the creation of assembly language, which introduced mnemonics to represent the machine code instructions, and labels to mark the targets of jumps in the code. If this sounds foreign to you, that's OK; I'll explain it later on.

The next invention was that of high level languages, which keep the programmer out of the low level aspects of program design. In C, for instance, the programmer never knows what value is in a register, and doesn't really care. He can code things with impunity and expect the compiler to create the machine code to do what he asks.

There are other classifications for languages, but they're not really important here. To become a good programmer, you must understand the three that I've outlined: machine code, assembly language and high level languages. You'll need to know how to code proficiently in the latter two.
CHOOSING A LANGUAGE

Rarely, if ever, will you have to program anything in machine code directly. Once in a while you will have to throw a machine code instruction into your assembly programs. Typically, you don't want to code an entire application in assembly. High level languages handle most of the work associated with I/O on the computer, and do it more reliably than you will in assembly.

I'm sure you've heard that assembly language is the fastest and most compact language there is. While this is true, you must also consider development time when you choose your language. Writing all of some large application in assembly is a task no one should have to take on. Functions such as taking input from the user and writing info to the disk can be done just as efficiently with a high level language. It will take you a twentieth of the writing time to make a high level language do what you want, and it will achieve the same result as a homemade assembly routine. Therefore, the overall structure of your code is most efficiently maintained in some high level language. My personal preference is C, but whichever one you prefer is fine by me. C is just the most straightforward to introduce assembly code into. I'll get into interfacing at a later time.

80x86 INTERNALS

Now that we've got the introductions out of the way, I would like to start on the basics of Intel x86 based machines. First of all, the smallest unit you can address directly on an Intel machine is a byte. A byte is a sequence of 8 bits. It is possible to change just a single bit, but not directly. Each bit can have a value of one or zero. If an individual bit is a one, it is said to be 'set'; if it's a zero, it's said to be 'cleared.'

The next size up from the byte is a word, which is two bytes end to end. A word is 16 bits, rather than 8. The next larger common size is a double word, which is 2 words, or 4 bytes, or 32 bits. Rarely is anything addressed in units of more than 32 bits.
For floating point calculations, there are larger storage locations, but they are beyond the scope of this introduction.

Intel machines are bizarre in a number of ways, not the least of which is how they store values in memory. As it turns out, the Intel line of processors doesn't store words or doublewords in the order you would expect. When the processor stores such a value in memory, it stores it in byte reversed format, often called little-endian: the low-order byte goes first, at the lower address. This isn't a big problem when you're writing code, since you rarely see directly how a variable is stored, but you need to be aware of it, particularly when you're debugging. A memory dump in your debugger will look pretty strange if you don't remember that the values are stored in byte reversed format.

What does this mean? Let's say you want to store a 16 bit (word) value to memory. You write it to memory, and take a look at a memory dump. Number the bits 0 through 15 in the order you would write them on paper, and separate the two bytes with a pipe symbol. Rather than appearing in the dump in a row like this:

    0 1 2 3 4 5 6 7 | 8 9 10 11 12 13 14 15

it is in fact stored like this:

    8 9 10 11 12 13 14 15 | 0 1 2 3 4 5 6 7

For example, the word 1234h shows up in a memory dump as the bytes 34 12. Only the order of the bytes is swapped; the bits inside each byte stay put.

Pretty weird way of doing things, but apparently Intel had a *really* good reason for doing this. It's never been explained adequately to me, however. I chalk that up as just one in a tremendous string of Intel quirks.

NUMBERING SYSTEMS AND BASES

Understanding different bases is essential to good assembly language programming. Humans are very used to the decimal numbering system, more specifically referred to as base ten. This means that there are ten distinct symbols available for each place of a decimal number: the numerals zero through nine.

Two other common bases are used in programming: binary and hexadecimal. Binary is what the computer uses at the very lowest level to represent all information. Binary digits have only two distinct numerals: zero and one. It looks like this: 100100011111...
This is what the computer uses, but it's nearly useless to humans. If I were to ask you what the number above is in base 10, you'd be very hard pressed to tell me in under thirty seconds. That's why the hexadecimal numbering system is used. It's the step in between decimal and binary. Hexadecimal is base 16. It was chosen because everything is done in units of 8 bits (one byte) on Intel processors, and each byte can be represented by exactly two hexadecimal digits. A two digit hex number can have the values zero through 255, just like an 8 bit binary number (one byte) can.

Hexadecimal has 16 individual symbols: zero through nine and 'A' through 'F'. 'A' and 'F' are the decimal values 10 and 15, respectively. The letters in between are the values in between 10 and 15.

True to all numbering systems, the places of a hexadecimal number go up in powers of its base. This may sound confusing, but it's not. The rightmost digit in any numbering system is always the ones digit. From there on, you multiply the previous place value by the base (in this case 16) to find the value of each successive place to the left. In a hex number, the rightmost digit is the ones place, the next is the 16's place, the next is the 256's place and so on. In a binary number, the places are successive powers of two: one, two, four, eight and so on. The same is true of decimal numbers, but you probably know how to read them already.

To read a hex or binary number, you need to multiply each digit by its place value and add up the results. Let's do a sample hex value: 3E87h. Typically, hex values are followed by an 'h'. It's done simply to keep you from getting confused about which base the number is in. It can be either upper or lower case, but the latter is more common.
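The place-value rule can also be checked mechanically. Here is a minimal sketch in Python (chosen only because it runs anywhere; it is not part of the original article, and the name `to_decimal` is made up for illustration). It multiplies each digit by its place value, starting from the rightmost digit:

```python
DIGITS = "0123456789ABCDEF"

def to_decimal(text, base):
    """Convert a digit string in the given base (2 to 16) to an integer."""
    total = 0
    place = 1  # the rightmost digit is always the ones place
    for ch in reversed(text.upper()):
        value = DIGITS.index(ch)  # raises ValueError for an unknown symbol
        if value >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        total += value * place
        place *= base  # each place is worth `base` times the previous one
    return total

print(to_decimal("3E87", 16))     # 16007
print(to_decimal("11111111", 2))  # 255
```

The trailing 'h' is a notation for human readers, so it is left off the string passed in here.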
Let's pull 3E87h apart by place:

    -------------------------------------
    Place:   4096     256      16       1
    -------------------------------------
    Digit:      3       E       8       7
           3*4096  14*256    8*16     7*1
    -------------------------------------
            12288 +  3584 +   128 +     7  = 16007

Reading and writing hex numbers is an important part of learning assembly. Much of debugging requires that you convert between bases. So if you want to learn assembly, be sure to get this subject down.

REGISTERS, MEMORY AND STORAGE

In order to do anything important, a CPU has to be able to store data somewhere. There are a number of places that data can be stored. The simplest to access are in the CPU itself, and are called registers. The general purpose registers on the 16 bit x86 machines are labeled 'AX', 'BX', 'CX', and 'DX'. These are the four registers that you will use the most often. There are others which tell you and the processor where to look for certain pieces of information. Registers are the fastest place to read and write information, since the CPU doesn't have to call on any external components in your computer in order to read from them or write to them.

The registers have other important uses in calling DOS and BIOS functions. Typically, they are used to send data to and receive data from these functions. For instance, you would put data into the AX register for the DOS or BIOS function to work with, and the function would then return a value in another register. You could then test that value and, depending on its contents, change the flow of your program.

The next level of storage is base memory. The memory resides off the actual CPU, which means that to access it, your application has to pull info across the data bus. This makes access substantially slower than that of registers. Intel machines have another quirk in how they access memory: they use a segmented structure. In standard 16 bit mode, which is all that DOS conveniently supports, memory is addressed using a 16 bit segment and a 16 bit offset.
In programs, you denote a memory location in the form 'segment:offset'. You can use data or registers to point to the memory you want to use. For instance, you can look at '1234:5678' as a memory location, or you can use a register for one or both of the segment and offset ('1234:BX' addresses the memory at segment 1234 and the offset given by the value currently in the BX register).

To access adjacent memory locations, you simply need to increment or decrement the offset by one. Adding one to the offset always takes you to the next byte of memory in a particular segment. Let's say you wanted to write data to five memory locations which all happen to be in a row, like a string. You would first put the data you want to write in a register. Then you would move that data to your first offset, say 'xxxx:0000' (it really doesn't matter what segment you are in; this works for all cases). Then, to write the next piece of data to the next location, you would write to 'xxxx:0001', and so on.

Typically, you don't have to explicitly name which segment you want to draw info from, because it is implied by the kind of operation you are doing. There are four segments you can address at any given time in any program. To reach them, you have to tell the CPU where they are in memory. In order to do this, you load the four segment registers with the addresses of the four segments. The four registers are named CS, DS, ES, and SS. They stand for Code, Data, Extra, and Stack segments, and they are mostly self explanatory. The code segment is where the code resides that you want the processor to run. The data segment is where you store all your data variables. The extra segment doesn't have any designated job. It has a variety of uses which we will discuss later. The stack segment is where all the stack data is stored. The stack is a specialized data segment which we will also discuss later.
When you do an operation that is associated with a given segment, you don't have to tell the processor which segment you want it to access. It knows implicitly which one to draw from. For instance, if you say:

    MOV AX,variable

you don't have to tell it you want to draw that information from the data segment. The processor knows you want to pull a variable from data. You are allowed to override this default, however. If you have data stored in the extra segment, you can get at it thusly:

    MOV AX,ES:variable

The CPU itself doesn't actually use this segmented convention. When you send the processor the segment and offset, it converts that into a 20 bit physical memory location. This being the case, in 16 bit standard mode the maximum amount of memory that the processor can directly access is 1,048,576 (2^20) bytes. This is where the DOS memory limit comes from. With the advent of 32 bit processors, the limit was drastically expanded, to 4 gigabytes of physical memory and around 64 terabytes of virtual address space. Unfortunately, DOS doesn't support 32 bit programs particularly well, so you're stuck writing 16 bit programs until a solution is devised (Windoze 4.0).

Rest assured that the segmented memory usage of the Intel line of processors is one of the most difficult aspects of the machine to master. If you are confused about segments, don't feel alone: thousands of people have been cursing Intel for years for choosing segmentation.

Permanent memory is different from registers and memory in that it retains the info stored on it after the computer has been turned off or rebooted. Hard drives, floppies, CD-ROMs, tape backups and the like are all permanent memory. Assembly code access to these devices is often difficult and obscure. Reading from a hard drive and the like is done much more easily in high level languages, so I won't talk about it here. If you absolutely need to access permanent memory, find a book that talks about DOS interrupts to do so.
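To make the memory discussion concrete, here is a small sketch in Python rather than assembly, so that it can be run anywhere. The article does not spell out how a segment and offset become a 20 bit address; the rule used below (segment times 16, plus offset) is the standard 8086 real-mode rule, not something taken from the text. The function names are made up for illustration.

```python
def physical_address(segment, offset):
    """Return the 20-bit physical address for a real-mode segment:offset pair."""
    segment &= 0xFFFF
    offset &= 0xFFFF
    # The segment is shifted left four bits (multiplied by 16), the offset
    # is added, and the result wraps at 1 MB as it does on an 8086.
    return ((segment << 4) + offset) & 0xFFFFF

def word_in_memory(value):
    """Return the two bytes of a 16-bit word in the order they sit in memory."""
    return [value & 0xFF, (value >> 8) & 0xFF]  # low byte first (little-endian)

print(hex(physical_address(0x1234, 0x5678)))        # 0x179b8
print(hex(physical_address(0x179B, 0x0008)))        # 0x179b8, same byte again
print([hex(b) for b in word_in_memory(0x1234)])     # ['0x34', '0x12']
```

The first two lines of output show why segments are confusing: many different segment:offset pairs name the same physical byte.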
Now that we have a general understanding of memory on Intel machines, let's look in more detail at CPU registers. Compared to other machines (the new PowerPC chips have around 30 general purpose registers!), Intel processors have relatively few regs. I've mentioned the general registers: AX, BX, CX, and DX. Each is slightly specialized for a certain task, but is still flexible enough to allow generalized work. Their full names are the Accumulator, Base, Counter, and Data registers.

Each of these four general purpose registers can also be addressed one byte at a time. Each of the registers can store two bytes, so there is an upper and a lower byte in each. To address only a single byte, the registers are split like this:

    AX is split into AL and AH for the lower and higher bytes.
    BX is split into BL and BH.
    CX is split into CL and CH.
    DX is split into DL and DH.

The next four registers are the segment registers (CS, DS, ES, SS), which point to the memory locations of their respective segments.

There are three pointer registers, named BP, IP, and SP. The BP (base pointer) is not used much, but is associated with the stack. You'll find that high level languages use the base pointer a lot, but you can generally ignore it. The IP (instruction pointer) tells the processor which instruction it is to execute next in the code segment. Each time an instruction is loaded into the processor, the IP is advanced to point to the next instruction. The SP is the stack pointer. It tells the processor where the top of the stack is in the stack segment.

The last two registers are the SI (source index) and DI (destination index) registers, which are often used in string operations. They are also used in indirect memory addressing, which is shown later.

There is actually one more register on an 8086, called the flags register. The flags register is used to detect certain things in the code. This register holds information temporarily after a comparison has been made, for instance.
When you compare the value in a memory location with a number and they are equal, the zero flag is set. The zero flag (ZF) is a single bit in the flags register. Other instructions check the status of that flag, and change the course of the code appropriately.

MOVING DATA WITH 'MOV'

How is data moved around with assembly language? The most common instruction by far is the 'MOV' instruction. Despite being short for 'move', it in fact copies data from one location to another. The MOV instruction takes two arguments, a destination and a source. The destination can be a register or a memory location; the source can be a register, a memory location, or an immediate. An immediate is data you actually hard code into your source. If you ask the computer to put a 5 into the AX register, the 5 is an immediate. The one combination you can't have is memory for both arguments. This means you can't simply move data from one memory location to another with this instruction; you have to use a register as a middleman. First you must move the data into a register, and then move the data in the register into the new memory location.

Here are some examples of how to use the MOV instruction:

    MOV Destination,Source    ;General form
    MOV AX,5                  ;Put the value 5 into AX
    MOV AX,CX                 ;Copy the data in CX to AX
    MOV BX,memory_location    ;Copy data from memory to BX

The name 'memory_location' above is a variable that was declared earlier in the program. You can access variables in many of the same ways you can in higher level languages. Just like in most languages except BASIC, you do have to declare your variables explicitly. With assembly language there are a variety of ways to address the variables, however. The first one, which you saw above, is called direct addressing. When assembled, the code will refer to a specific memory location, and take the data directly out of it to put into the BX register. Direct addressing is the simplest form of addressing except for immediate. The other form of addressing is indirect, which is similar to using pointers in C.
By putting a register in brackets, you say that you want to access the memory pointed to by the contents of the register (similar to the * indirection operator applied to pointers in C). Here is a simple example of indirect addressing:

    MOV BX,offset memory_location
    MOV AX,[BX]

This example puts the offset of the desired variable into BX, then uses the indirection operator to load the value stored at the address held in BX into AX. The end result is that the value stored in the variable 'memory_location' is put into AX.

There are other, more powerful methods of indirect addressing. The one above is called register indirect addressing. The other modes are 'based' (or indexed) with displacement and 'based and indexed with displacement.' Examples follow:

    MOV AX,[BX]         ;Register indirect
    MOV AX,[BX+2]       ;Based with displacement
    MOV AX,[BX+SI+2]    ;Based and indexed with displacement

All are valid ways of addressing memory. You can also turn those around to move data from the register into memory:

    MOV [BX],AX
    MOV [BX+2],AX
    MOV [BX+SI+2],AX

The operand [BX+SI+2] simply adds the values in BX and SI, adds two, and uses the total as the offset to access. In the first group of examples the data at that offset is loaded into the AX register; in the second group the contents of AX are written to that offset. These are all difficult concepts to understand, and you won't have to use them until you start trying to solve very difficult problems in assembly.

SIMPLE INSTRUCTIONS: OR, AND, XOR

Now that we've seen the basic ways of moving data around between the various storage locations in the computer, we can start exploring ways of manipulating data to get useful results from it. Some of the simplest data manipulation instructions are the bitwise operators: OR, AND, and XOR. If you've used Pascal, don't confuse these with the Boolean operators of the same names there. They work on the bit level to change the value of a single byte, word or dword.
For instance, if you have the decimal value 170, which is 10101010 binary, and you AND that with 00001111, you will simply cut off the top 4 bits of the original number like so:

        10101010
    AND 00001111
        --------
        00001010

AND works by comparing each bit in the first byte with the bit in the same position of the second byte and checking whether both are set. If, and only if, both bits are set does the destination bit get set. This is a very handy function to know. It means that if you want to test whether an individual bit is set, all you have to do is use an 'AND mask.' The mask is the bit pattern you use to get rid of excess information. You simply set a bit in the mask if you want to test that particular bit. For instance, if you want to test whether bit number seven is set, you would use the mask 10000000, since bits are numbered from right to left, starting at zero.

        XXXXXXXX
    AND 10000000
        --------
        X0000000

Only the value of the bit we don't mask out comes through the AND. We can then test whether the resultant byte is zero or not. If it's zero, we know the original bit 7 was a zero. If it's nonzero, then bit 7 was a one.

The OR function works very similarly:

        10101010
    OR  00001111
        --------
        10101111

This works by checking whether either one or both of the bits in a certain position are a one; if so, the resultant bit is also set to a one. Only if both the bits are zero will a zero be the answer. It's useful if you want to ensure that all the bits from the source byte get into the destination, perhaps with some additional bits set as well.

The last of the logical operators is XOR. This one operates similarly to the OR operator, except that the resulting bit is set if and only if exactly one of the two compared bits is set. This example explains:

        11110000
    XOR 10101010
        --------
        01011010

If and only if there is a difference between the two bits will the resultant bit be set to a one.
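The three worked examples above can be reproduced with a few lines of Python, whose `&`, `|` and `^` operators do the same bit-by-bit work as AND, OR and XOR. This is only a sketch for checking the arithmetic; the helper names `bits` and `bit_is_set` are made up for illustration.

```python
def bits(value):
    """Format a byte as eight binary digits."""
    return format(value & 0xFF, "08b")

def bit_is_set(value, bit):
    """Test one bit with an AND mask; bits are numbered from 0 on the right."""
    mask = 1 << bit
    return (value & mask) != 0

a = 0b10101010  # decimal 170
print(bits(a & 0b00001111))                # 00001010
print(bits(a | 0b00001111))                # 10101111
print(bit_is_set(a, 7), bit_is_set(a, 0))  # True False
print(bits(0b11110000 ^ 0b10101010))       # 01011010
```

The last line is the XOR example from the text.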
You will often find XOR used in extremely simple encryption algorithms. If you use a certain bit pattern to XOR a string of text, XORing it again with the same pattern will return the original text. In the example above, XORing the resulting byte with the original pattern, '10101010', will return the original byte:

        01011010  :Result from above
    XOR 10101010  :Original pattern from above
        --------
        11110000  :Original byte from above

The three operators above are extremely important to remember, because you will use them all the time in assembly to mask and set bits.

SHL, SHR

Two other bitwise instructions come in quite handy as well: SHL and SHR. They stand for Shift Left and Shift Right. These operators move a series of bits in the direction indicated (either left or right). A shift instruction requires two arguments. The first one is the register which you wish to shift, and the second is how many bits you wish to shift it. This is an example:

    MOV AL,10101010b
    SHR AL,1

    Before: 10101010
    After:  01010101

You can see that it simply shifts the bits to the right, and fills the vacated leftmost bit with a zero.

As it turns out, on the original 8086 you can't specify a shift of more than one bit at a time with immediates:

    SHL AX,3    ;Doesn't run on pre-286 machines!

This is a no-no on pre-286 machines. You can only tell the processor one shift at a time using immediates:

    SHL AX,1
    SHL AX,1
    SHL AX,1

This does the same thing as the example above, and it has the added benefit that it will actually run without crashing on old XTs.

While you can't use larger immediates, there is a way to tell the computer to shift more than one bit at a time: using the CL register. You set the number of times you want a register shifted in CL.
Then you tell the processor to shift a register CL times in whichever direction you choose:

    MOV CL,3
    SHL AX,CL

I don't know why this restriction was placed on this instruction, but Intel does seem to do some 'unique' things, so I just accept it. Once you move on to programming for higher level processors, you can ignore this quirk. As it is, you can probably program in 286 mode, since very few users have 8086's any more. Make sure you include code which detects the processor, so your program doesn't just crash on 8086 machines. Keep in mind that if you are using a shareware assembler, such as A86, you can only program in 8086 mode. If you're using TASM, on the other hand, you can get around many of the quirks of the 8086 processor by programming in 286 mode or above.

Back to programming! The shifting operators have an interesting effect upon the register values they modify. When you shift a register to the left one place, you multiply the value inside it by two. This occurs for every bit you shift. Shifting 3 bits to the left multiplies the value by 8 (2*2*2). Shifting to the right has the opposite effect. It will divide the register's contents by two for every bit, throwing away any remainder. This is very handy, since it is much faster than actually using the multiplication instructions provided with the Intel machines. If you know you're going to multiply something by a power of two, by all means use a shift instead of a multiply. The multiply operation on the Intel machines is much slower than simply shifting. Typically, shifting even multiple places takes only a few clock cycles on 386 class machines (thanks to the barrel shifter). The multiply instruction, on the other hand, can take tens of clock cycles to complete.

Coding in assembly means you want to write the fastest, smallest code possible. Using this technique will help you dramatically. I use this technique in my graphics programs, which I will show later.
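The link between shifting and multiplying or dividing by powers of two is easy to check in a short Python sketch. The functions below are illustrative models of SHL and SHR on a 16 bit register such as AX (the names `shl16` and `shr16` are made up); they ignore the flags the real instructions also update.

```python
def shl16(value, count):
    """Model SHL on a 16-bit register: bits pushed past bit 15 are lost."""
    return (value << count) & 0xFFFF

def shr16(value, count):
    """Model SHR on a 16-bit register: zeros enter from the left."""
    return (value & 0xFFFF) >> count

print(shl16(5, 3))                         # 40, same as 5 * 8
print(shr16(40, 3))                        # 5, same as 40 // 8
print(shr16(0b10101010, 1) == 0b01010101)  # True
print(hex(shl16(0x9000, 1)))               # 0x2000
```

The last line is the catch: once a set bit falls off the left end of the register, the result is no longer the true product, so the shift trick is only safe when the answer still fits in the register.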
MUL

Now that I've covered a limited type of multiplication, I'll go into detail on the Intel general purpose unsigned multiplication instruction. The fact that it's unsigned means you can't use it if you want to be able to include negative values in your work. You will find relatively limited use for negative values anyway, so you should find this adequate. If you want signed multiplying, look at the IMUL instruction (it's more complicated, since you have to know two's complement binary encoding).

The original MUL instruction takes input in the form of either a byte or a word. The 386 version also allows doubleword multipliers. With an 8086, you always multiply a register or memory location by either the AL or AX register. The result is stored in twice the amount of space of the multipliers. If you multiply together two bytes, the result can be as large as 65,025 (255*255), which you cannot store in only 8 bits. Therefore, the processor stores it in a word (specifically, the AX register). If you give the instruction word size multipliers, it stores the result in the combo 'DX:AX', with the AX register holding the lower 16 bits of the answer.

If the answer fits in the same amount of space as the multipliers, the Carry Flag (CF) and the Overflow Flag (OF) are both cleared. If the answer spills into the upper half of the answer area, both flags are set. In English? If you multiply together two byte size numbers, and the answer can also fit into a byte, the CF and OF are clear. Otherwise, the flags are set. In the case of word operands, if the answer can be stored completely in the AX register, then the two flags are clear. If the answer also requires that you read the DX register, the flags are set.

There are two forms of the instruction you can use. One of the multipliers is always the AL or AX register, and the other can be either a general purpose register or a memory location.
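The rules for where MUL puts its answer, and when it sets the flags, can be modelled in a few lines of Python. This is an illustrative sketch only: `mul8` and `mul16` are made-up names, and the single `carry` value stands for both CF and OF, which MUL always sets or clears together.

```python
def mul8(al, operand):
    """Model the byte form of MUL: AX = AL * operand."""
    ax = (al & 0xFF) * (operand & 0xFF)
    carry = (ax >> 8) != 0  # CF and OF are set when AH is nonzero
    return ax, carry

def mul16(ax, operand):
    """Model the word form of MUL: DX:AX = AX * operand."""
    product = (ax & 0xFFFF) * (operand & 0xFFFF)
    dx, ax = product >> 16, product & 0xFFFF
    carry = dx != 0  # CF and OF are set when DX is nonzero
    return dx, ax, carry

print(mul8(3, 3))            # (9, False)
print(mul8(200, 2))          # (400, True)
print(mul16(0x1000, 0x10))   # (1, 0, True)
```

In the last case the product is 10000h, so AX alone reads zero and the set carry flag is the only hint that the rest of the answer is sitting in DX.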
Let's do an example of MUL:

    MOV AL,3
    MOV BL,3
    MUL BL
    JC  large

This example multiplies AL by BL and leaves the answer, 9, in AX. Since 9 fits in a byte, the carry flag is clear and the jump to 'large' is not taken.

LODSx, MOVSx, STOSx

There are two versions each of LODSx, MOVSx, and STOSx on the early Intel machines, and a third one was added with the advent of the 386. Replace the 'x' in the above opcodes with either a 'b' for byte, a 'w' for word, or a 'd' for dword (on 386 and above). Now let's get to what they do.

LODSx stands for Load String Byte (or Word, or Dword). This instruction looks at what the registers DS:SI point to, and loads that value into the AL register. If you use LODSW, it loads the info into the full AX register. Using LODSD requires that you have a 386 or above, and you have to use the 32 bit extended registers. I won't get into this, because there is simply too much to talk about. If there is a request for more info, I'd be happy to write a 386 assembly tutorial. Once the information is loaded into the register, the SI register is incremented by the size of the data you loaded (1 for byte, 2 for word, 4 for dword), assuming the direction flag is clear. This makes this series of instructions handy for doing repeated loads, since it reads all the information in a segment in a row if put into a looping statement.

The MOVSB instruction is the only way to move data from one memory location to another directly. You can of course use the standard MOV instruction, but you have to use a middleman register like this:

    MOV AX,memory_loc_1
    MOV memory_loc_2,AX

The MOVSB instruction is a very handy way of moving large chunks of data from one place to another. It takes no arguments. It looks at the registers DS:SI, and copies the data pointed to there to the location pointed to by ES:DI. It also increments the values of SI and DI by the length of the data moved (1 for byte, 2 for word, 4 for dword). This makes it a quick way of moving data, since once the move is started, you can keep repeating the same instruction to move large chunks of memory.
Since you have to set up several registers, this method may look complicated, but it is truly the fastest way to transfer information in any language. Once set up, you can put it into a loop to move as much as a whole segment (64k) of memory anywhere you want. Under ideal conditions that takes about .0026 seconds on a 386-25.

For instance, to write to a color screen in text mode, you must write to the segment '0B800h'. This is some sample code to write a single letter to the first position on the screen (upper left corner):

    MOV AX,@DATA     ; Should be done already.
    MOV DS,AX        ; Quirk: can't MOV immediates to segment regs
    MOV AX,0B800h    ; Assumes you have a color monitor
    MOV ES,AX        ; Segment registers are set up
    MOV SI,offset string_variable
    MOV DI,0
    MOVSB

Let's examine what this code does. First, it loads the appropriate values into the segment registers. It puts the segment where your variables are stored into DS. Then it moves 0B800h into ES, which is the segment where the screen is written to. You may have noticed you can't simply move immediate data into segment registers; you have to go through a general register such as AX. This is just another cute little quirk Intel provided for your entertainment.

Now that the segments are set, you need to set the offsets for the reading and writing. You point to the string you want to print to the screen by putting its offset into SI. Then, you point to the beginning of the screen by setting DI to zero. Now, the MOVSB instruction is invoked. It looks at DS:SI, which is the segment and offset of the string you want printed. It copies the first byte of it to ES:DI, which is the screen segment and the offset of the first position on the screen.

The STOSB instruction is the opposite of the LODSB instruction. STOSB puts the byte in the AL register into the spot pointed to by ES:DI. Then it increments DI by the length of the data stored.

There are plenty more instructions you can learn, but with just the few that I've described here, you can start writing assembly language routines which do useful work.
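To see how the three string instructions use their registers, here is a toy model in Python. It is a sketch under stated assumptions, not an emulator: the class name `Tiny8086` and the data segment value 2000h are made up for illustration, the direction flag is assumed clear (so SI and DI count upward), and addresses are formed with the usual real-mode rule of segment times 16 plus offset.

```python
class Tiny8086:
    """Toy model of the registers and memory used by LODSB, STOSB and MOVSB."""

    def __init__(self):
        self.mem = bytearray(1 << 20)  # 1 MB of real-mode memory
        self.ds = self.es = self.si = self.di = self.al = 0

    def _addr(self, segment, offset):
        # Real-mode physical address: segment * 16 + offset, wrapped to 20 bits.
        return ((segment << 4) + offset) & 0xFFFFF

    def lodsb(self):
        """AL = byte at DS:SI, then SI moves to the next byte."""
        self.al = self.mem[self._addr(self.ds, self.si)]
        self.si = (self.si + 1) & 0xFFFF

    def stosb(self):
        """Byte at ES:DI = AL, then DI moves to the next byte."""
        self.mem[self._addr(self.es, self.di)] = self.al
        self.di = (self.di + 1) & 0xFFFF

    def movsb(self):
        """Byte at ES:DI = byte at DS:SI, then both SI and DI move on."""
        self.mem[self._addr(self.es, self.di)] = self.mem[self._addr(self.ds, self.si)]
        self.si = (self.si + 1) & 0xFFFF
        self.di = (self.di + 1) & 0xFFFF


cpu = Tiny8086()
cpu.ds, cpu.es = 0x2000, 0xB800       # 2000h is an arbitrary data segment
message = b"Hi"
cpu.mem[0x20000:0x20000 + len(message)] = message
cpu.si = cpu.di = 0
cpu.movsb()                           # copies 'H' to the first screen byte
print(chr(cpu.mem[0xB8000]), cpu.si, cpu.di)  # H 1 1
```

Because SI and DI advance on their own, calling `movsb()` again would copy the next byte without any further setup, which is what makes these instructions so convenient inside a loop.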
STRUCTURE

Assembly language is a very simple language structurally. There is a single indentation pattern for the whole code, unlike most high level languages like C:

    if ( a == b )
    {
        ...
    }

There, you indent the code differently depending on what part of the code you're in. With assembly, it's all written linearly:

    MOV AX,3
    MOV BX,4
    AND AX,BX
    OR  AX,7

With no jumping instructions, assembly routines execute in the order they are written. In order to control the flow of execution, jumping commands have to be put into the code. Using jump statements means you probably want to compare something first, then change the flow of your program depending on the results. Comparing data is typically done with the CMP instruction. Here is some sample code:

    MOV AX,3
    MOV BX,4
    CMP AX,BX
    JA  above
    JB  below
    JE  equal

This example compares AX (3) to BX (4), and jumps to the appropriate location. In this case, it will jump to the code marked by 'below', since AX is below the value of BX. The locations are marked elsewhere in the code by a label on its own line, followed by a colon, like this:

    MOV AX,3
    MOV BX,8
    loophere:
    DEC BX          ;Subtracts one from BX
    CMP BX,AX
    JA  loophere    ;JA = Jump if above

This code sets AX to 3 and BX to 8. Then, it subtracts one from BX with the DEC instruction, and checks whether BX is still larger than AX. If it is, it goes back through the loop again. It will repeat until BX has the value of 3. Note that the label sits after the two MOVs; if it came before them, BX would be reset to 8 on every pass and the loop would never end.

There are lots of jump instructions, but here are the most common and useful:

    JE  = Jump if equal
    JA  = Jump if above
    JB  = Jump if below
    JC  = Jump if carry flag set
    JMP = Jump unconditionally

I'm afraid this is about all I can provide in one mag article. There's lots more to talk about; I haven't even covered a small portion of the instructions available to you. Basically, the only way you're ever going to learn assembly is if you try writing programs. Often when starting out, you'll need assistance. This is what this article is for.
Hopefully you got an idea of what is needed to write assembly language programs. If anyone wants me to write more on this in later editions of DnA, let me know. I'll certainly be running other articles on programming, and if there's any particular subject you want covered, I'd be more than happy to write on it. I did a graphics programming tutorial this month, so check it out if you're interested.

I hope you learned something; I sure did. I love to get feedback and talk programming with fellow coders, so please drop me a line if you have questions, ideas, flames, criticisms, or any other matter you wish to discuss. I'm reachable at Digital Decay only right now. My E-mail account got shut down for over-use of other accounts, but I'll have it back soon, I hope. Until then, have a day!

Zephyr [Cosys - Digital Decay]
August 12, 1994